Learning the areas of expertise of classifiers in an ensemble

نویسندگان

  • Esma Kilic
  • Ethem Alpaydin
چکیده

There are various machine learning algorithms for extracting patterns from data; but recently, decision combination has become popular to improve accuracy over single learner systems. The fundamental idea behind combining the decisions of an ensemble of classifiers is that different classifiers most probably misclassify different patterns and by suitably combining the decisions of complementary classifiers, accuracy can be improved. In this paper, we investigate two kinds of classifier systems which are capable of estimating how much to weight each base classifier dynamically; during the calculation of the overall output for a given test data instance: (1) In “referee-based system”, a referee is associated with each classifier which learns the area of expertise of its associated classifier and weights it accordingly. (2) However, “gating system” learns to partition the input space among all classifiers. Each referee in referee-based system learns a two-class problem (i.e., whether to use or not to use a classifier) whereas a gating system learns an L-class problem assigning the input to one of L base classifiers. Our analysis on 20 datasets from different domains and a classifier pool including 21 base learning algorithms reveals that the gating system tends to concentrate on a few of the base classifiers whereas a use of referees leads to a more balanced use of the base classifiers. Moreover, in the case of referees, it is better to use a small subset of base classifiers, instead of a single one. The study shows that, by using well-trained selection unit (referee or gating), we can get as high accuracy as using all the base classifiers (to combine their decisions) with drastic decrease in the number of base classifiers used, and also improve accuracy. The improvement is significant especially in cases when none of the base classifiers has high accuracy and it indicates that selecting classifiers appears promising as a means of solving hard learning problems.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

متن کامل

A Novel Ensemble Approach for Anomaly Detection in Wireless Sensor Networks Using Time-overlapped Sliding Windows

One of the most important issues concerning the sensor data in the Wireless Sensor Networks (WSNs) is the unexpected data which are acquired from the sensors. Today, there are numerous approaches for detecting anomalies in the WSNs, most of which are based on machine learning methods. In this research, we present a heuristic method based on the concept of “ensemble of classifiers” of data minin...

متن کامل

Combining Classifier Guided by Semi-Supervision

The article suggests an algorithm for regular classifier ensemble methodology. The proposed methodology is based on possibilistic aggregation to classify samples. The argued method optimizes an objective function that combines environment recognition, multi-criteria aggregation term and a learning term. The optimization aims at learning backgrounds as solid clusters in subspaces of the high...

متن کامل

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...

متن کامل

A Hybrid Framework for Building an Efficient Incremental Intrusion Detection System

In this paper, a boosting-based incremental hybrid intrusion detection system is introduced. This system combines incremental misuse detection and incremental anomaly detection. We use boosting ensemble of weak classifiers to implement misuse intrusion detection system. It can identify new classes types of intrusions that do not exist in the training dataset for incremental misuse detection. As...

متن کامل

A Pre-Trained Ensemble Model for Breast Cancer Grade Detection Based on Small Datasets

Background and Purpose: Nowadays, breast cancer is reported as one of the most common cancers amongst women. Early detection of the cancer type is essential to aid in informing subsequent treatments. The newest proposed breast cancer detectors are based on deep learning. Most of these works focus on large-datasets and are not developed for small datasets. Although the large datasets might lead ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011